Add workspace search for initialized git submodules #1000

Open
riley-wv wants to merge 3 commits into pingdotgg:main from riley-wv:feat/git-submodule-support

@riley-wv riley-wv commented Mar 13, 2026

What Changed

Resolves #997

Workspace search now indexes files inside initialized Git submodules instead of stopping at the submodule boundary.

This PR:

  • recurses through initialized submodules when building the git-backed workspace index
  • keeps submodule roots as directory entries instead of file entries
  • includes tracked and untracked files from initialized submodules in search results
  • adds regression coverage for nested initialized submodules
  • adds regression coverage to verify submodule-local ignore rules are respected during indexing
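
To illustrate the second bullet, here is a minimal sketch of how directory entries can be derived purely from file paths. The `WorkspaceEntry` shape and the `buildEntries` name are hypothetical and not taken from the PR. The point is that a submodule root shows up as a directory only because files beneath it are listed, which assumes the parent repo's gitlink path has already been dropped from the file list.

```typescript
// Hypothetical entry shape; the PR's real types are not shown in this thread.
type WorkspaceEntry = { path: string; kind: "file" | "directory" };

// Derive directory entries from the ancestors of every listed file path.
function buildEntries(filePaths: readonly string[]): WorkspaceEntry[] {
  const directories = new Set<string>();
  const files = new Set<string>();

  for (const filePath of filePaths) {
    files.add(filePath);
    // Register every ancestor, e.g. "a/b/c.ts" contributes "a" and "a/b".
    const segments = filePath.split("/");
    for (let depth = 1; depth < segments.length; depth++) {
      directories.add(segments.slice(0, depth).join("/"));
    }
  }

  const entries: WorkspaceEntry[] = [];
  for (const path of Array.from(directories).sort()) {
    entries.push({ path, kind: "directory" });
  }
  for (const path of Array.from(files).sort()) {
    entries.push({ path, kind: "file" });
  }
  return entries;
}
```

With input `["README.md", "vendor/lib/src/index.ts"]`, the path `vendor/lib` (a submodule root in this made-up layout) is emitted with kind `"directory"` and never as a file.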

Why

Files inside initialized submodules were missing from workspace search results even when those submodules were present in the checked-out workspace. Recursing through submodules fixes that gap, and the added tests protect the implementation details most likely to regress: nested traversal and correct ignore handling inside submodules.

Checklist

  • This PR is small and focused
  • I explained what changed and why

Note

Add workspace search for initialized git submodules

  • Adds listInitializedGitSubmodulePaths in workspaceEntries.ts to enumerate initialized submodule directories via git submodule foreach --quiet pwd, resolving them to relative paths.
  • Adds listGitWorkspaceFilePaths to recursively list files from each submodule (up to 8 concurrent scans), respecting each submodule's own .gitignore rules and excluding submodule root paths from the top-level file listing.
  • Refactors buildWorkspaceIndexFromGit to delegate file discovery to listGitWorkspaceFilePaths, so submodule contents appear as directory/file entries rather than opaque file entries.
  • Behavioral Change: submodule roots previously appeared as file entries in the workspace index; they now appear as directory entries containing their actual files.
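
The first bullet's approach could look roughly like the sketch below. This is not the PR's code: `toRelativeSubmodulePaths` and `listInitializedSubmodulePathsSketch` are hypothetical names, and the filtering rules are assumptions. It runs `git submodule foreach --quiet pwd`, which prints one absolute directory per checked-out submodule, and converts each line to a path relative to the workspace root.

```typescript
import { execFile } from "node:child_process";
import * as path from "node:path";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical helper: convert the absolute paths printed by `pwd` into
// forward-slash paths relative to the workspace root.
function toRelativeSubmodulePaths(cwd: string, stdout: string): string[] {
  return stdout
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((absolute) => path.relative(cwd, absolute).split(path.sep).join("/"))
    // Assumption: drop the root itself and anything that escapes the workspace.
    .filter(
      (relative) =>
        relative.length > 0 && relative !== ".." && !relative.startsWith("../"),
    );
}

// Hypothetical wrapper around the git command named in the summary above.
async function listInitializedSubmodulePathsSketch(cwd: string): Promise<string[]> {
  const { stdout } = await execFileAsync(
    "git",
    ["submodule", "foreach", "--quiet", "pwd"],
    { cwd },
  );
  return toRelativeSubmodulePaths(cwd, stdout);
}
```

One caveat with this sketch: if `cwd` contains a symlink, the directories printed by `pwd` may not share its prefix, so a real implementation would likely need to resolve `cwd` first.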

Macroscope summarized 57e1aef.

Summary by CodeRabbit

  • New Features
    • Workspace indexing now supports Git submodules, including nested submodules, ignore rules, and concurrent processing for enhanced performance.
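
For readers unfamiliar with bounded concurrency, a helper like the `mapWithConcurrency` mentioned in the walkthrough could be written as follows. The body is an assumption, since the PR's implementation is not shown here. A fixed number of workers pull the next index from a shared counter, so no more than `limit` scans run at once and results keep their input order.

```typescript
// Sketch of a bounded-concurrency map; only the name comes from the PR summary.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let nextIndex = 0;

  // Each worker claims the next unprocessed index until none remain.
  const worker = async (): Promise<void> => {
    while (true) {
      const index = nextIndex++;
      if (index >= items.length) return;
      results[index] = await fn(items[index] as T, index);
    }
  };

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Called with a limit of 8, this would match the "up to 8 concurrent scans" described in the Macroscope note. Note that in this sketch the first rejected `fn` call rejects the whole map.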

- recurse through initialized git submodules when building git-backed workspace indexes
- include tracked and untracked files from initialized submodules in workspace search results
- exclude parent repo gitlink entries so submodule roots are indexed as directories, not files
- propagate truncation state across recursive submodule scans
- add regression coverage for submodule indexing and refresh the generated MSW worker asset
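
The gitlink exclusion and truncation propagation described in this commit message can be sketched as a small merge step. The `ScanResult` shape and the `mergeSubmoduleScans` name are hypothetical. The sketch assumes each scan reports its own paths (relative to its repository root) plus a `truncated` flag.

```typescript
// Hypothetical result shape for scanning one repository.
type ScanResult = { paths: string[]; truncated: boolean };

function mergeSubmoduleScans(
  parent: ScanResult,
  submodules: ReadonlyArray<{ root: string; scan: ScanResult }>,
): ScanResult {
  const submoduleRoots = new Set(submodules.map((s) => s.root));

  // Drop the parent's gitlink entries so submodule roots are not listed as files.
  const paths = parent.paths.filter((p) => !submoduleRoots.has(p));

  let truncated = parent.truncated;
  for (const { root, scan } of submodules) {
    // Prefix submodule-relative paths with the submodule root.
    for (const p of scan.paths) paths.push(`${root}/${p}`);
    // One truncated scan marks the merged result as truncated.
    truncated = truncated || scan.truncated;
  }
  return { paths, truncated };
}
```

Because the same merge can be applied at every level, a truncated scan in a nested submodule would surface in the top-level result.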
github-actions bot added the size:L label (100-499 changed lines, additions + deletions) on Mar 13, 2026
coderabbitai bot commented Mar 13, 2026

Walkthrough

This PR implements comprehensive Git submodule support in workspace indexing, introducing utilities to detect initialized submodules, enumerate files within them, and aggregate results with concurrent processing. The changes enable workspace entry resolution for files residing in Git submodules, alongside corresponding test coverage for submodule handling and edge cases.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Git Submodule Indexing (apps/server/src/workspaceEntries.ts) | Introduces submodule detection (listInitializedGitSubmodulePaths), file enumeration (listGitWorkspaceFilePaths), and path prefixing utilities. Modifies buildWorkspaceIndexFromGit to aggregate file listings from main repository and submodules with concurrent processing via mapWithConcurrency. Adds splitLineSeparatedPaths helper and updates truncation logic to reflect submodule scan states. |
| Test Infrastructure & Coverage (apps/server/src/workspaceEntries.test.ts) | Augments runGit to support optional Git configuration (-c flags). Introduces test helpers: initGitRepo for repository initialization and commitAll for staging/committing changes. Adds test cases for submodule initialization, nested submodules, ignore rules, and concurrent index builds. |
| Dependency Version (apps/web/public/mockServiceWorker.js) | Bumps MSW version from 2.12.9 to 2.12.10. |
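
The test helpers named in the table could be shaped like the sketch below. Only the names `runGit`, `initGitRepo`, and `commitAll` come from the summary. The signatures, the `buildGitArgs` helper, and the placeholder identity values are assumptions made for illustration.

```typescript
import { execFileSync } from "node:child_process";

// Hypothetical helper: expand { key: value } into repeated `-c key=value` flags,
// which git expects before the subcommand.
function buildGitArgs(
  args: readonly string[],
  config: Readonly<Record<string, string>> = {},
): string[] {
  const configFlags = Object.entries(config).flatMap(([key, value]) => [
    "-c",
    `${key}=${value}`,
  ]);
  return [...configFlags, ...args];
}

// Assumed signature: run git in `cwd` and return its stdout as text.
function runGit(
  cwd: string,
  args: readonly string[],
  config?: Readonly<Record<string, string>>,
): string {
  return execFileSync("git", buildGitArgs(args, config), { cwd, encoding: "utf8" });
}

// Initialize a repository with a throwaway identity so commits succeed in CI.
function initGitRepo(cwd: string): void {
  runGit(cwd, ["init", "--quiet"]);
  runGit(cwd, ["config", "user.email", "test@example.com"]);
  runGit(cwd, ["config", "user.name", "Test"]);
}

// Stage everything in the working tree and commit it.
function commitAll(cwd: string, message: string): void {
  runGit(cwd, ["add", "--all"]);
  runGit(cwd, ["commit", "--quiet", "-m", message]);
}
```

A plausible reason for the `-c` support is that recent Git versions refuse file-transport submodule clones by default, so a test adding a local submodule might call something like `runGit(parentDir, ["submodule", "add", childDir, "vendor/lib"], { "protocol.file.allow": "always" })`. Whether the PR's tests use that exact setting is not confirmed in this thread.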

Sequence Diagram

sequenceDiagram
    participant Repo as Git Repository
    participant Indexer as Workspace Indexer
    participant SubmoduleScanner as Submodule Scanner
    participant Aggregator as File Aggregator
    participant Index as Workspace Index

    Repo->>Indexer: buildWorkspaceIndexFromGit(cwd)
    activate Indexer
    
    Indexer->>SubmoduleScanner: listInitializedGitSubmodulePaths()
    activate SubmoduleScanner
    SubmoduleScanner->>Repo: git config --file .gitmodules
    SubmoduleScanner-->>Indexer: [submodule_paths]
    deactivate SubmoduleScanner
    
    Indexer->>Indexer: listGitWorkspaceFilePaths(main repo)
    Note over Indexer: Enumerate files in main repository
    
    Indexer->>SubmoduleScanner: mapWithConcurrency(listGitWorkspaceFilePaths per submodule)
    activate SubmoduleScanner
    loop For each submodule with concurrency
        SubmoduleScanner->>Repo: git ls-files (per submodule)
        Repo-->>SubmoduleScanner: file_list
    end
    SubmoduleScanner-->>Indexer: [aggregated_file_paths, truncated_state]
    deactivate SubmoduleScanner
    
    Indexer->>Aggregator: Aggregate main repo + submodule files
    activate Aggregator
    Aggregator->>Aggregator: Filter duplicate/submodule paths
    Aggregator-->>Indexer: consolidated_file_paths
    deactivate Aggregator
    
    Indexer->>Index: Build directory & file entries from aggregated paths
    activate Index
    Index-->>Indexer: workspace_index
    deactivate Index
    
    Indexer-->>Repo: Indexed workspace with submodule support
    deactivate Indexer

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Through nested repos the rabbit springs,
Git submodules in tangled strings,
Concurrent scans with careful dance,
Every file gets its indexing chance! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: adding Git submodule support to workspace search indexing, which directly addresses the linked issue #997. |
| Linked Issues check | ✅ Passed | The PR fulfills the primary objective [#997]: workspace search now indexes files inside initialized Git submodules, enabling mention resolution for submodule files. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to implementing Git submodule indexing support; the MSW worker version bump is routine dependency maintenance not out-of-scope. |
| Description check | ✅ Passed | The PR description meets the required template structure with clear sections for What Changed, Why, and a completed Checklist. The description adequately explains the changes (Git submodule indexing), the rationale (files in submodules were missing from search), and provides specific implementation details. |

github-actions bot added the vouch:unvouched label (PR author is not yet trusted in the VOUCHED list) on Mar 13, 2026
…xing

- add coverage for nested initialized git submodules in workspace search
- verify submodule-local ignore rules are respected when indexing submodule files
- extract small git test helpers to keep submodule setup readable
- harden the existing regression suite against future indexing regressions
@riley-wv (Author)

@coderabbitai review

coderabbitai bot commented Mar 13, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Development

Successfully merging this pull request may close these issues.

Support referencing files inside Git submodules when using @ mentions in chat

1 participant